47 research outputs found

    Plan To Predict: Learning an Uncertainty-Foreseeing Model for Model-Based Reinforcement Learning

    In Model-based Reinforcement Learning (MBRL), model learning is critical since an inaccurate model can bias policy learning via generating misleading samples. However, learning an accurate model can be difficult since the policy is continually updated and the induced distribution over visited states used for model learning shifts accordingly. Prior methods alleviate this issue by quantifying the uncertainty of model-generated samples. However, these methods only quantify the uncertainty passively after the samples were generated, rather than foreseeing the uncertainty before model trajectories fall into those highly uncertain regions. The resulting low-quality samples can induce unstable learning targets and hinder the optimization of the policy. Moreover, while being learned to minimize one-step prediction errors, the model is generally used to predict for multiple steps, leading to a mismatch between the objectives of model learning and model usage. To this end, we propose Plan To Predict (P2P), an MBRL framework that treats the model rollout process as a sequential decision making problem by reversely considering the model as a decision maker and the current policy as the dynamics. In this way, the model can quickly adapt to the current policy and foresee the multi-step future uncertainty when generating trajectories. Theoretically, we show that the performance of P2P can be guaranteed by approximately optimizing a lower bound of the true environment return. Empirical results demonstrate that P2P achieves state-of-the-art performance on several challenging benchmark tasks. Comment: Accepted by NeurIPS202

    Safe Offline Reinforcement Learning with Real-Time Budget Constraints

    Aiming at promoting the safe real-world deployment of Reinforcement Learning (RL), research on safe RL has made significant progress in recent years. However, most existing works in the literature still focus on the online setting, where risky violations of the safety budget are likely to be incurred during training. Besides, in many real-world applications, the learned policy is required to respond to dynamically determined safety budgets (i.e., constraint thresholds) in real time. In this paper, we target the above real-time budget constraint problem under the offline setting, and propose Trajectory-based REal-time Budget Inference (TREBI) as a novel solution that approaches this problem from the perspective of trajectory distribution. Theoretically, we prove an error bound of the estimation on the episodic reward and cost under the offline setting and thus provide a performance guarantee for TREBI. Empirical results on a wide range of simulation tasks and a real-world large-scale advertising application demonstrate the capability of TREBI in solving real-time budget constraint problems under offline settings. Comment: We propose a method to handle the constraint problem with dynamically determined safety budgets under the offline setting

    Identification of hub genes associated with hepatitis B virus-related hepatocellular cancer using weighted gene co-expression network analysis and protein-protein interaction network analysis

    Background. Chronic hepatitis B virus (HBV) infection is the main cause of hepatocellular carcinoma. However, the mechanisms of HBV-related hepatocellular carcinoma (HCC) progression are practically unknown. Materials and Methods. RNA-sequencing results and clinical data for GSE121248 and GSE17548 were accessed from the Gene Expression Omnibus data library. We used Sangerbox 3.0 to screen for differentially expressed genes (DEGs). Weighted gene co-expression network analysis (WGCNA) was employed to select core modules and hub genes, and protein-protein interaction network module analysis also played a significant part in this selection. Validation was performed using RNA-sequencing data of cancer and normal tissues of HBV-related HCC patients in The Cancer Genome Atlas liver hepatocellular cancer database (TCGA-LIHC). Results. 787 DEGs were identified from GSE121248 and 772 DEGs were identified from GSE17548. WGCNA indicated that black modules (99 genes) and grey modules (105 genes) were significantly associated with HBV-related HCC. Gene ontology analysis found a direct correlation between DEGs and the regulation of cell movement and adhesion; the internal components and external packaging structure of the plasma membrane; and signaling receptor binding, calcium ion binding, etc. Kyoto Encyclopedia of Genes and Genomes pathway analysis found that cytokine receptors, cytokine-cytokine receptor interactions, and viral protein interactions with cytokines were importantly associated with HBV-related HCC. Finally, we further validated 6 key genes, including C7, EGR1, EGR3, FOS, FOSB, and prostaglandin-endoperoxide synthase 2, by using TCGA-LIHC. Conclusions. We identified 6 hub genes as candidate biomarkers for HBV-related HCC. These hub genes may act as an essential part of HBV-related HCC progression.

    Inhibiting Phase Transfer of Protein Nanoparticles by Surface Camouflage-A Versatile and Efficient Protein Encapsulation Strategy

    Engineering a system with a high mass fraction of active ingredients, especially water-soluble proteins, is still an ongoing challenge. In this work, we developed a versatile surface camouflage strategy that can engineer systems with an ultrahigh mass fraction of proteins. By formulating protein molecules into nanoparticles, the need for molecular modification was transformed into a surface camouflage of protein nanoparticles. Thanks to electrostatic attractions and van der Waals interactions, we camouflaged the surface of protein nanoparticles through the adsorption of carrier materials. The adsorption of carrier materials successfully inhibited the phase transfer of insulin, albumin, β-lactoglobulin, and ovalbumin nanoparticles. As a result, the obtained microcomposites featured record protein encapsulation efficiencies near 100% and a record protein mass fraction of 77%. After encapsulation in microcomposites, the insulin showed a hypoglycemic effect for at least 14 d with one single injection, while that of insulin solution lasted only ∼4 h. Peer reviewed

    The role of tripartite motif-containing 28 in cancer progression and its therapeutic potentials

    Tripartite motif-containing 28 (TRIM28) belongs to the tripartite motif (TRIM) family. TRIM28 not only binds and degrades its downstream targets, but also acts as a transcription co-factor to inhibit gene expression. A growing number of studies have shown that TRIM28 plays a vital role in tumorigenesis and tumor progression. Here, we review the role of TRIM28 in tumor proliferation, migration, invasion, and cell death. Moreover, we summarize the important role of TRIM28 in tumor stemness sustainability and immune regulation. Because of the importance of TRIM28 in tumors, TRIM28 may be a candidate target for anti-tumor therapy and play an important role in tumor diagnosis and treatment in the future.

    Intelligent Model for Dynamic Shear Modulus and Damping Ratio of Undisturbed Marine Clay Based on Back-Propagation Neural Network

    In this study, a series of resonant-column experiments were conducted on marine clays from Bohai Bay and Hangzhou Bay, China. The characteristics of the dynamic shear modulus (G) and damping ratio (D) of these marine clays were examined. It was found that G and D not only vary with shear strain (γ), but they also have a strong connection with soil depth (H) (reflected by the mean effective confining pressure (σm) in the laboratory test conditions). With increasing H (σm) and fixed γ, the value of G gradually increases; conversely, the value of D gradually decreases, and this is accompanied by the weakening of the decay or growth rate. An intelligent model based on a back-propagation neural network (BPNN) was developed for the calculation of these parameters. Compared with existing function models, the proposed intelligent model avoids the forward propagation of data errors and the need for human intervention regarding the fitting parameters. The model can accurately predict the G and D characteristics of marine clays at different H (σm) and the corresponding γ. The prediction accuracy is universal and does not strictly depend on the number of neurons in the hidden layer of the neural network.

    Shunt Damping Vibration Control Technology: A Review

    Smart materials and structures have attracted a significant amount of attention for their vibration control potential in engineering applications. Compared to the traditional active technique, shunt damping utilizes an external circuit across the terminals of smart-structure-based transducers to realize vibration control. Transducers can simultaneously serve as an actuator and a sensor. This unique advantage offers great potential for designing sensorless devices to be used in structural vibration control and reduction engineering. This review combines piezoelectric shunt damping (PSD) and electromagnetic shunt damping (EMSD), establishes a unified governing equation of PSD and EMSD, and reports the unique vibration control performance of these shunts. The schematics of the shunt circuits are given and demonstrated, and some common control principles and equations of these shunts are summarized. Finally, challenges and perspectives of the shunt damping technology are discussed, and suggestions are made based on the knowledge and experience of the authors.

    Models as Agents: Optimizing Multi-Step Predictions of Interactive Local Models in Model-Based Multi-Agent Reinforcement Learning

    Research in model-based reinforcement learning has made significant progress in recent years. Compared to single-agent settings, the exponential dimension growth of the joint state-action space in multi-agent systems dramatically increases the complexity of the environment dynamics, which makes it infeasible to learn an accurate global model and thus necessitates the use of agent-wise local models. However, during multi-step model rollouts, the prediction of one local model can affect the predictions of other local models in the next step. As a result, local prediction errors can be propagated to other localities and eventually give rise to considerably large global errors. Furthermore, since the models are generally used to predict for multiple steps, simply minimizing one-step prediction errors regardless of their long-term effect on other models may further aggravate the propagation of local errors. To this end, we propose Models as AGents (MAG), a multi-agent model optimization framework that reversely treats the local models as multi-step decision making agents and the current policies as the dynamics during the model rollout process. In this way, the local models are able to consider their multi-step mutual effects on each other before making predictions. Theoretically, we show that the objective of MAG is approximately equivalent to maximizing a lower bound of the true environment return. Experiments on the challenging StarCraft II benchmark demonstrate the effectiveness of MAG.

    RESEARCH ON BOLT CONNECTION LOOSE MECHANISM UNDER DYNAMIC TENSION COMPRESSION AND SHEAR LOAD

    Aiming at the loosening of bolt connections in a rotating mechanism under dynamic tension, compression, and shear load, the failure mechanism and its influencing factors are studied using virtual prototyping technology and the finite element analysis method. First, based on the virtual prototyping technology, the load on the failed bolt connection structure is extracted. Second, the finite element method is used to analyze the influence of the number of bolts, the preload, the thickness of the connecting piece, and the spring gasket. An evaluation index system for digital analysis is set up to guide the design of measures for improving anti-loosening performance. In this paper, the design and study of the bolts of a servo device rotating mechanism is taken as an example. The digital design and analysis method is carried out, which solves the problem of bolt connection loosening. The conclusion shows that the improved design is consistent with engineering practice, and the bolt connection structure is free from loosening and failure.

    Polypyrrole-Coated Low-Crystallinity Iron Oxide Grown on Carbon Cloth Enabling Enhanced Electrochemical Supercapacitor Performance

    It is highly attractive to design pseudocapacitive metal oxides as anodes for supercapacitors (SCs). However, as they have poor conductivity and lack active sites, they generally exhibit unsatisfactory capacitance under high current density. Herein, polypyrrole-coated low-crystallinity Fe2O3 supported on carbon cloth (D-Fe2O3@PPy/CC) was prepared by chemical reduction and electrodeposition methods. The low-crystallinity Fe2O3 nanorods achieved using a NaBH4 treatment offered more active sites and enhanced the Faradaic reaction in surface or near-surface regions. The construction of a PPy layer gave more charge storage at the Fe2O3/PPy interface, helping to limit the volume effect derived from Na+ transfer in the bulk phase. Consequently, D-Fe2O3@PPy/CC displayed enhanced capacitance and stability. In 1 M Na2SO4, it showed a specific capacitance of 615 mF cm−2 (640 F g−1) at 1 mA cm−2 and still retained 79.3% of its initial capacitance at 10 mA cm−2 after 5000 cycles. The design of low-crystallinity metal oxides and polymer nanocomposites is expected to be widely applicable for the development of state-of-the-art electrodes, thus opening new avenues for energy storage.